Grammar-based Compression of Unranked Trees

Authors

  • Adrià Gascón
  • Markus Lohrey
  • Sebastian Maneth
  • Carl Philipp Reh
  • Kurt Sieber
Abstract

We introduce forest straight-line programs (FSLPs) as a compressed representation of unranked ordered node-labelled trees. FSLPs are based on the operations of forest algebra and generalize tree straight-line programs. We compare the succinctness of FSLPs with two other compression schemes for unranked trees: top dags and tree straight-line programs of first-child/next-sibling encodings. Efficient translations between these formalisms are provided. Finally, we show that equality of unranked trees in the setting where certain symbols are associative or commutative can be tested in polynomial time. This generalizes previous results for testing isomorphism of compressed unordered ranked trees.
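
As an illustration of the first-child/next-sibling encoding mentioned in the abstract, the following is a minimal sketch of how an ordered unranked forest can be mapped to a binary tree and back. The names `Node`, `BinNode`, `fcns`, and `decode` are hypothetical and chosen only for this example; they do not come from the paper, and the sketch shows only the uncompressed encoding, not FSLPs or tree straight-line programs themselves.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Node of an ordered unranked tree (illustrative type, not from the paper)."""
    label: str
    children: List["Node"] = field(default_factory=list)


@dataclass
class BinNode:
    """Binary node: left pointer = first child, right pointer = next sibling."""
    label: str
    first_child: Optional["BinNode"] = None
    next_sibling: Optional["BinNode"] = None


def fcns(forest: List[Node]) -> Optional[BinNode]:
    """Encode an ordered forest as a binary tree (None for the empty forest)."""
    result: Optional[BinNode] = None
    # Build the sibling chain from right to left so each node links to its successor.
    for node in reversed(forest):
        result = BinNode(node.label, fcns(node.children), result)
    return result


def decode(b: Optional[BinNode]) -> List[Node]:
    """Invert fcns: walk the next-sibling chain and recurse into first children."""
    forest: List[Node] = []
    while b is not None:
        forest.append(Node(b.label, decode(b.first_child)))
        b = b.next_sibling
    return forest


# Example: the tree a(b, c(d), e)
tree = Node("a", [Node("b"), Node("c", [Node("d")]), Node("e")])
enc = fcns([tree])
```

In this sketch the root `a` has no next sibling, its first child is `b`, and `c` and `e` are reached by following `next_sibling` pointers from `b`; `decode(enc)` recovers the original forest.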

Similar resources

Dictionary-Based Tree Compression

Trees are a ubiquitous data structure in computer science. LISP, for instance, was designed to manipulate nested lists, that is, ordered unranked trees. Already at that time, DAGs were used to detect common subexpressions, a process known as “hash consing.” In a DAG every distinct subtree is represented only once (but can be referenced many times) and hence it constitutes a dictionary-based comp...

Logical Definability and Query Languages over Unranked Trees

Unranked trees, that is, trees with no restriction on the number of children of nodes, have recently attracted much attention, primarily as an abstraction of XML documents. In this paper, we study logical definability over unranked trees, as well as collections of unranked trees, that can be viewed as databases of XML documents. The traditional approach to definability is to view each tree as a...

A Note on Recognizable Sets of Unranked and Unordered Trees

Recognizable sets of unranked, unordered trees have been introduced in Courcelle [C89] in a Myhill-Nerode [N58] style of inverse homomorphisms of suitable finite magmas. This is equivalent to being the union of some congruence classes of a congruence of finite index. We will add to the well-known concept of regular tree grammars a handling of nodes labeled with ε. With this rather unconvent...

Equivalences between Ranked and Unranked Weighted Tree Automata via Binarization

Encoding unranked trees to binary trees, henceforth called binarization, is an important method to deal with unranked trees. For each of three binarizations we show that weighted (ranked) tree automata together with the binarization are equivalent to weighted unranked tree automata; even in the probabilistic case. This allows one to easily adapt training methods for weighted (ranked) tree automata ...

Finite automata on unranked trees: extensions by arithmetical and equality constraints

The notion of unranked trees has attracted much interest in current research, especially due to their application as formal models of XML documents. In particular, several automata and logic formalisms on unranked trees have been considered (again) in the literature, and many results that had previously been shown for the ranked-tree setting have turned out to hold for the unranked-tree setting...

Journal:
  • CoRR

Volume: abs/1802.05490

Publication date: 2018